Learning Determinantal Point Processes
Determinantal point processes (DPPs), which arise in random matrix theory and
quantum physics, are natural models for subset selection problems where
diversity is preferred. Among many remarkable properties, DPPs offer tractable
algorithms for exact inference, including computing marginal probabilities and
sampling; however, an important open question has been how to learn a DPP from
labeled training data. In this paper we propose a natural feature-based
parameterization of conditional DPPs, and show how it leads to a convex and
efficient learning formulation. We analyze the relationship between our model
and binary Markov random fields with repulsive potentials, which are
qualitatively similar but computationally intractable. Finally, we apply our
approach to the task of extractive summarization, where the goal is to choose a
small subset of sentences conveying the most important information from a set
of documents. In this task there is a fundamental tradeoff between sentences
that are highly relevant to the collection as a whole, and sentences that are
diverse and not repetitive. Our parameterization allows us to naturally balance
these two characteristics. We evaluate our system on data from the DUC 2003/04
multi-document summarization task, achieving state-of-the-art results.
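As a rough illustration of why a DPP favors diverse subsets, the sketch below scores subsets with the standard L-ensemble formula P(Y) = det(L_Y) / det(L + I). The kernel form L[i, j] = q_i * <phi_i, phi_j> * q_j, the function names, and the toy numbers are assumptions made for this example; the abstract itself only states that the parameterization is feature-based.

```python
import itertools
import numpy as np

def l_ensemble_kernel(quality, features):
    """Build L[i, j] = q_i * <phi_i, phi_j> * q_j (assumed quality/diversity form)."""
    phi = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarity = phi @ phi.T
    return quality[:, None] * similarity * quality[None, :]

def subset_probability(L, subset):
    """P(Y) = det(L_Y) / det(L + I) for an L-ensemble DPP."""
    idx = list(subset)
    # The determinant of the empty submatrix is 1 by convention.
    numerator = np.linalg.det(L[np.ix_(idx, idx)]) if idx else 1.0
    return numerator / np.linalg.det(L + np.eye(L.shape[0]))

# Toy data (hypothetical): items 0 and 1 are near-duplicates, item 2 is different.
quality = np.array([1.0, 1.0, 1.0])
features = np.array([[1.0, 0.0], [0.99, 0.14], [0.0, 1.0]])
L = l_ensemble_kernel(quality, features)

p_similar = subset_probability(L, (0, 1))
p_diverse = subset_probability(L, (0, 2))

# Probabilities over all 2^3 subsets should sum to one.
total = sum(
    subset_probability(L, s)
    for r in range(4)
    for s in itertools.combinations(range(3), r)
)
```

With equal quality scores, the near-duplicate pair receives a lower probability than the dissimilar pair, which is the relevance/diversity balance the abstract describes.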
'Catastrophic Failure' Theories and Disaster Journalism: Evaluating Media Explanations of the Black Saturday Bushfires
In recent decades, academic researchers of natural disasters and emergency management have developed a canonical literature on 'catastrophe failure' theories such as disaster responses from US emergency management services (Drabek, 2010; Quarantelli, 1998) and the Three Mile Island nuclear power plant (Perrow, 1999). This article examines six influential theories from this field in an attempt to explore why Victoria's disaster and emergency management response systems failed during Australia's Black Saturday bushfires. How well, if at all, are these theories understood by journalists, disaster and emergency management planners, and policy-makers? On examining the Country Fire Authority's response to the fires, as well as the media's reportage of them, we use the 2009 Black Saturday bushfires as a theory-testing case study of failures in emergency management, preparation and planning. We conclude that journalists can learn important lessons from academics' specialist knowledge about disaster and emergency management responses.
A quantum Mirković-Vybornov isomorphism
We present a quantization of an isomorphism of Mirković and Vybornov which
relates the intersection of a Slodowy slice and a nilpotent orbit closure in
, to a slice between spherical Schubert varieties in the
affine Grassmannian of (with weights encoded by the Jordan types of the
nilpotent orbits). A quantization of the former variety is provided by a
parabolic W-algebra and of the latter by a truncated shifted Yangian. Building
on earlier work of Brundan and Kleshchev, we define an explicit isomorphism
between these non-commutative algebras, and show that its classical limit is a
variation of the original isomorphism of Mirković and Vybornov. As a
corollary, we deduce that the W-algebra is free as a left (or right) module
over its Gelfand-Tsetlin subalgebra, as conjectured by Futorny, Molev, and
Ovsienko.
Comment: v2: 48 pages. Major rewrite following referee comments. Added proof
of a conjecture of Futorny, Molev, and Ovsienko that the finite W-algebra is
free over its Gelfand-Tsetlin subalgebra.
Three-Way Joins on MapReduce: An Experimental Study
We study three-way joins on MapReduce. Joins are very useful in a multitude
of applications from data integration and traversing social networks, to mining
graphs and automata-based constructions. However, joins are expensive, even for
moderate data sets; we need efficient algorithms to perform distributed
computation of joins using clusters of many machines. MapReduce has become an
increasingly popular distributed computing system and programming paradigm. We
consider a state-of-the-art MapReduce multi-way join algorithm by Afrati and
Ullman and show when it is appropriate for use on very large data sets. By
providing a detailed experimental study, we demonstrate that this algorithm
scales much better than what is suggested by the original paper. However, if
the join result needs to be summarized or aggregated, as opposed to being only
enumerated, then the aggregation step can be integrated into a cascade of
two-way joins, which makes the cascade more efficient than the multi-way
algorithm and thus the preferred solution.
Comment: 6 pages
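The idea of folding aggregation into a cascade of two-way joins can be sketched in plain Python. This is an in-memory illustration only, not MapReduce code and not the authors' implementation; the relations R(a, b), S(b, c), T(c, d), the choice of a count-per-a aggregate, and the function names are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def count_cascade(R, S, T):
    """Count tuples of R(a,b) JOIN S(b,c) JOIN T(c,d) per value of a,
    aggregating after each two-way join instead of enumerating the result."""
    s_by_b = defaultdict(list)
    for b, c in S:
        s_by_b[b].append(c)

    # First two-way join, aggregated immediately to counts per (a, c).
    ac_counts = Counter()
    for a, b in R:
        for c in s_by_b[b]:
            ac_counts[(a, c)] += 1

    # Second two-way join: only the number of T tuples per c is needed.
    t_by_c = Counter(c for c, _d in T)
    result = Counter()
    for (a, c), n in ac_counts.items():
        if t_by_c[c]:
            result[a] += n * t_by_c[c]
    return result

def count_enumerate(R, S, T):
    """Reference version: enumerate the full three-way join, then count."""
    result = Counter()
    for a, b in R:
        for b2, c in S:
            for c2, _d in T:
                if b == b2 and c == c2:
                    result[a] += 1
    return result

# Hypothetical toy relations.
R = [(1, 10), (1, 11), (2, 10)]
S = [(10, 100), (11, 100), (10, 101)]
T = [(100, "x"), (100, "y"), (101, "z")]

cascade_counts = dict(count_cascade(R, S, T))
enumerated_counts = dict(count_enumerate(R, S, T))
```

Both functions return the same counts, but the cascade never materializes the full join result: the intermediate relation shrinks to one counter entry per (a, c) pair.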
Optimising Selective Sampling for Bootstrapping Named Entity Recognition
Training a statistical named entity recognition system in a new domain requires costly manual annotation of large quantities of in-domain data. Active learning promises to reduce the annotation cost by selecting only highly informative data points. This paper is concerned with a real active learning experiment to bootstrap a named entity recognition system for a new domain of radio astronomical abstracts. We evaluate several committee-based metrics for quantifying the disagreement between classifiers built using multiple views, and demonstrate that the choice of metric can be optimised in simulation experiments with existing annotated data from different domains. A final evaluation shows that we gained substantial savings compared to a randomly sampled baseline.
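To make committee-based disagreement concrete, here is a minimal sketch of vote entropy, one commonly used disagreement measure. The abstract does not say which metrics were evaluated, so the choice of vote entropy, the per-sentence averaging, and the function names and labels are assumptions for illustration.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Entropy of the label distribution voted by the committee for one token."""
    counts = Counter(votes)
    k = len(votes)
    return -sum((n / k) * math.log(n / k) for n in counts.values())

def sentence_disagreement(token_votes):
    """Mean token-level vote entropy; higher means the committee disagrees more."""
    return sum(vote_entropy(v) for v in token_votes) / len(token_votes)

def select_most_informative(pool, batch_size):
    """Return the ids of the sentences the committee disagrees on most."""
    ranked = sorted(pool, key=lambda sid: sentence_disagreement(pool[sid]), reverse=True)
    return ranked[:batch_size]

# Hypothetical pool: each sentence maps to per-token votes from two classifiers.
pool = {
    "s1": [("O", "O"), ("PER", "PER")],    # full agreement
    "s2": [("O", "ORG"), ("PER", "LOC")],  # disagreement on every token
    "s3": [("O", "O"), ("PER", "ORG")],    # disagreement on one token
}
selected = select_most_informative(pool, batch_size=2)
```

Sentences selected this way would be sent for manual annotation first, in place of a random sample.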